Abstract

The probabilistic serial (PS) rule is one of the most prominent randomized rules 
for the assignment problem. It is well-known for its superior fairness and welfare 
properties. However, PS is not immune to manipulative behaviour by the agents. 
We initiate the study of the computational complexity of an agent manipulating 
the PS rule. We show that computing an expected utility better response is
NP-hard. On the other hand, we present a polynomial-time algorithm to compute
a lexicographic best response. For the case of two agents, we show that even an 
expected utility best response can be computed in polynomial time. Our result 
for the case of two agents relies on an interesting connection with sequential 
allocation of discrete objects. 

Keywords: Assignment problem, probabilistic serial mechanism, fair 
1. Introduction


The assignment problem is one of the most fundamental and important problems
in economics and computer science [see e.g., 12, 13]. In this setting,
agents express preferences over objects and, based on these preferences, the 
objects are allocated to the agents. The model is applicable to many resource 
allocation or fair division settings where the objects may be public houses, school 
seats, course enrollments, kidneys for transplant, car park spaces, chores, joint 
assets, or time slots in schedules. A randomized or fractional assignment rule
takes the preferences of the agents into account in order to allocate each agent
a fraction of each object. If the objects are indivisible but allocated in a
randomized way, the fraction can also be interpreted as the probability of
receiving the object. Randomization is widespread in resource allocation since it
is one of the most natural ways to ensure procedural fairness [8]. Randomized assignments
have been used to assign public land, radio spectra to broadcasting companies, 
and US permanent visas to applicants [Footnote 1, 0. 

Among the various randomized/fractional assignment rules, the probabilistic
serial (PS) rule is one of the most prominent rules [14, 16, 17]. PS
works as follows. Each agent expresses a linear order over the set of houses (we 
use the term house throughout the paper though we stress any object could be 
allocated with these mechanisms). Each house is considered to have a divisible 
probability weight of one, and agents simultaneously and with the same speed 
eat the probability weight of their most preferred house. Once a house has 
been eaten by a subset of agents, these agents proceed to eat their next most 
preferred house that has not been completely eaten. The procedure terminates 
after all the houses have been eaten. The random allocation of an agent by PS 
is the amount of each object he has eaten. Although PS was originally defined 
for the setting where the number of houses is equal to the number of agents, it 
can be used without any modification for fewer or more houses than agents.

The probabilistic serial (PS) rule fares better than any other random
assignment rule in terms of fairness and welfare. In particular, it
satisfies strong envy-freeness and efficiency with respect to both the stochastic
dominance (SD) and downward lexicographic (DL) relations. SD is one
of the most fundamental relations between fractional allocations because one
allocation is SD-preferred over another if, for every utility function consistent
with the ordinal preferences, the former yields at least as much expected utility
as the latter. DL is a refinement of SD and is based on lexicographic comparisons
between fractional allocations. Generalizations of the PS rule have been
recommended in many settings. The PS rule also satisfies some desirable
incentive properties. If the number of objects is at most the number of agents,
then PS is weak SD-strategyproof. Another well-established rule, random
serial dictator (RSD), is not envy-free, is not as efficient as PS, and the fractional
allocations under RSD are #P-complete to compute [3]. However, unlike RSD,
PS is not strategyproof.

In this paper, we examine the following natural question for the first time:
what is the computational complexity of an agent computing a different
preference to report so as to get a better PS outcome? This problem of computing
the optimal manipulation has already been studied in great depth for voting
rules [see e.g., 11]. Ekici and Kesten [10] showed that when agents are not
truthful, the outcome of PS may not satisfy desirable properties related to
efficiency and envy-freeness. Hence, even if agents can in principle manipulate,
it is important to determine how hard it is to compute a beneficial misreport of
their preferences. The complexity of manipulation of the PS rule is also related
to the study of Nash dynamics and better responses. Efficient algorithms to
compute best responses can be used to understand Nash dynamics under the
mechanism.

In order to compare random allocations, an agent needs to consider relations
between them. We consider three well-known relations between random
allocations: (i) expected utility (EU), (ii) stochastic dominance
(SD), and (iii) downward lexicographic (DL). For EU, an agent seeks a different
allocation that yields more expected utility. For DL, an agent seeks an
allocation that gives a higher probability to the most preferred alternative that has
different probabilities in the two allocations. Throughout the paper, we assume
that agents express strict preferences, i.e., they are not indifferent between any
two houses.

Contributions. We initiate the study of computing best responses for
the PS mechanism, one of the most established randomized rules for the
assignment problem. The study is additionally motivated by complementing
experimental work where we observe that as the number of houses relative to
the number of agents grows, the percentage of manipulable profiles (for which
at least one agent has incentive to manipulate) increases, maximizing at around
99%. We present a polynomial-time algorithm to compute the DL best
response for multiple agents and houses. For the case of two agents, we present
a polynomial-time algorithm to compute an EU best response for any utilities
consistent with the ordinal preferences. The two-agent case is also of special
importance since various disputes arise between two parties. The result for the
EU best response relies on an interesting connection between the PS rule and
the sequential allocation rule for indivisible objects. In a sequential allocation,
a picking sequence is specified for the agents and each agent gets his most preferred
available object when his turn comes. For general $n$, we show that
computing an EU best response is NP-hard. The result contrasts sharply with the
recent result of Bouveret and Lang [7] that a best response can be computed in
polynomial time for sequential allocation.

2. Preliminaries 

An assignment problem $(N, H, \succ)$ consists of a set of agents $N = \{1, \ldots, n\}$,
a set of houses $H = \{h_1, \ldots, h_m\}$ and a preference profile $\succ = (\succ_1, \ldots, \succ_n)$
in which $\succ_i$ denotes a complete, transitive and strict ordering on $H$
representing the preferences of agent $i$ over the houses in $H$. A fractional
assignment is an $(n \times m)$ matrix $[p(i)(h_j)]_{1 \le i \le n,\, 1 \le j \le m}$ such that for all $i \in N$ and
$h_j \in H$, $0 \le p(i)(h_j) \le 1$; and for all $j \in \{1, \ldots, m\}$, $\sum_{i \in N} p(i)(h_j) = 1$.

The value $p(i)(h_j)$ is the fraction of house $h_j$ that agent $i$ gets. Each row
$p(i) = (p(i)(h_1), \ldots, p(i)(h_m))$ represents the allocation of agent $i$. A fractional
assignment can also be interpreted as a random assignment where $p(i)(h_j)$ is
the probability of agent $i$ getting house $h_j$.

A standard method to compare random allocations is to use the SD (stochastic
dominance) relation. Given two random assignments $p$ and $q$, $p(i) \succ_i^{SD} q(i)$,
i.e., a player $i$ SD prefers allocation $p(i)$ to $q(i)$, if

$$\sum_{h_j \in \{h_k : h_k \succ_i h\}} p(i)(h_j) \ge \sum_{h_j \in \{h_k : h_k \succ_i h\}} q(i)(h_j) \quad \text{for all } h \in H$$

and

$$\sum_{h_j \in \{h_k : h_k \succ_i h\}} p(i)(h_j) > \sum_{h_j \in \{h_k : h_k \succ_i h\}} q(i)(h_j) \quad \text{for some } h \in H.$$

Given two random assignments $p$ and $q$, $p(i) \succ_i^{DL} q(i)$, i.e., a player $i$
DL (downward lexicographic) prefers allocation $p(i)$ to $q(i)$, if $p(i) \ne q(i)$ and
for the most preferred house $h$ such that $p(i)(h) \ne q(i)(h)$, we have that
$p(i)(h) > q(i)(h)$.

When agents are considered to have cardinal utilities for the objects, we
denote by $u_i(h)$ the utility that agent $i$ gets from house $h$. We will assume that
the total utility of an agent equals the sum of the utilities that he gets from
each of the houses. Given two random assignments $p$ and $q$, $p(i) \succ_i^{EU} q(i)$, i.e.,
a player $i$ EU (expected utility) prefers allocation $p(i)$ to $q(i)$, if

$$\sum_{h \in H} u_i(h) \cdot p(i)(h) > \sum_{h \in H} u_i(h) \cdot q(i)(h).$$

Since for all $i \in N$, agent $i$ compares assignment $p$ with assignment $q$ only
with respect to his allocations $p(i)$ and $q(i)$, we will sometimes abuse the
notation and use $p \succ_i^{SD} q$ for $p(i) \succ_i^{SD} q(i)$. A random assignment rule takes as
input an assignment problem $(N, H, \succ)$ and returns a random assignment which
specifies what fraction or probability of each house is allocated to each agent.
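The three relations can be checked mechanically. The following is a minimal Python sketch, assuming allocations are represented as dictionaries from house names to exact fractions; the function names `sd_prefers`, `dl_prefers` and `eu_prefers`, as well as the two illustrative allocations and the utilities, are assumptions made for this sketch and are not part of the paper.

```python
from fractions import Fraction as F

def sd_prefers(p, q, order):
    """True if p is strictly SD-preferred to q; `order` lists houses, most preferred first."""
    weak, strict = True, False
    for idx in range(len(order)):
        better = order[:idx]  # the set {h_k : h_k is strictly preferred to order[idx]}
        sp = sum((p[h] for h in better), F(0))
        sq = sum((q[h] for h in better), F(0))
        weak = weak and sp >= sq
        strict = strict or sp > sq
    return weak and strict

def dl_prefers(p, q, order):
    """True if p is DL-preferred to q: compare the most preferred house on which they differ."""
    for h in order:
        if p[h] != q[h]:
            return p[h] > q[h]
    return False  # identical allocations

def eu_prefers(p, q, utility):
    """True if p yields strictly more expected utility than q."""
    return sum(utility[h] * p[h] for h in utility) > sum(utility[h] * q[h] for h in utility)

# Illustrative data (assumed for this sketch).
order = ["h1", "h2", "h3"]
p = {"h1": F(3, 4), "h2": F(0), "h3": F(1, 4)}
q = {"h1": F(1, 2), "h2": F(1, 3), "h3": F(1, 6)}
u = {"h1": 7, "h2": 6, "h3": 0}

print(dl_prefers(p, q, order), sd_prefers(p, q, order), eu_prefers(q, p, u))
```

In this illustration the two allocations are SD-incomparable, so the DL and EU relations can rank them in opposite directions.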


3. The Probabilistic Serial Rule and its Manipulation 


The Probabilistic Serial (PS) rule is a random assignment algorithm in which
we consider each house as infinitely divisible [6, 16]. At each point in time, each
agent is eating (consuming the probability mass of) his most preferred house 
that has not been completely eaten and each agent eats at the same unit speed. 
Hence all the houses are eaten at time m/n and each agent receives a total of 
m/n units of houses. The probability of house hj being allocated to i is the 
fraction of house hj that i has eaten. The following example adapted from 
[Section 7, jgj] shows how PS works. 

Example 1 (PS rule). Consider an assignment problem with the following
preference profile.

$\succ_1: h_1, h_2, h_3$
$\succ_2: h_2, h_1, h_3$
$\succ_3: h_2, h_3, h_1$

Agents 2 and 3 start eating $h_2$ simultaneously whereas agent 1 eats $h_1$. When 2
and 3 finish $h_2$, agent 1 has only eaten half of $h_1$. The timing of the eating can
be seen below.


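Time interval    [0, 1/2]    [1/2, 3/4]    [3/4, 1]
Agent 1          h_1         h_1           h_3
Agent 2          h_2         h_1           h_3
Agent 3          h_2         h_3           h_3

The final allocation computed by PS is

$$PS(\succ_1, \succ_2, \succ_3) = \begin{pmatrix} 3/4 & 0 & 1/4 \\ 1/4 & 1/2 & 1/4 \\ 0 & 1/2 & 1/2 \end{pmatrix}.$$
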
Consider the assignment problem in Example 1. If agent 1 misreports his
preferences as $\succ'_1: h_2, h_1, h_3$, then

$$PS(\succ'_1, \succ_2, \succ_3) = \begin{pmatrix} 1/2 & 1/3 & 1/6 \\ 1/2 & 1/3 & 1/6 \\ 0 & 1/3 & 2/3 \end{pmatrix}.$$

Then, if $u_1(h_1) = 7$, $u_1(h_2) = 6$, and $u_1(h_3) = 0$, agent 1 gets more
expected utility when he reports $\succ'_1$: he obtains $7 \cdot 1/2 + 6 \cdot 1/3 = 5.5$
instead of $7 \cdot 3/4 = 5.25$ under truthful reporting. In the example, although truth-telling is
a DL best response, it is not necessarily an EU best response for agent 1.
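The simultaneous eating process is easy to simulate with exact arithmetic. The sketch below is one possible event-driven implementation, not the authors' code: the name `ps_rule` and the dictionary-based representation are assumptions. It is run on the profile of Example 1 and on the misreport $h_2, h_1, h_3$ of agent 1 that produces the second allocation.

```python
from fractions import Fraction

def ps_rule(prefs, houses):
    """Simulate PS. prefs maps agent -> list of houses (most preferred first, may be partial)."""
    remaining = {h: Fraction(1) for h in houses}
    alloc = {a: {h: Fraction(0) for h in houses} for a in prefs}
    while True:
        # Each agent targets the first house on its list that is not yet exhausted.
        target = {}
        for a, order in prefs.items():
            for h in order:
                if remaining[h] > 0:
                    target[a] = h
                    break
        if not target:  # nobody can eat any more
            return alloc
        eaters = {}
        for a, h in target.items():
            eaters.setdefault(h, []).append(a)
        # Advance time until the first targeted house is exhausted (unit eating speed).
        dt = min(remaining[h] / len(group) for h, group in eaters.items())
        for h, group in eaters.items():
            for a in group:
                alloc[a][h] += dt
            remaining[h] -= dt * len(group)

houses = ["h1", "h2", "h3"]
truthful = ps_rule({1: ["h1", "h2", "h3"], 2: ["h2", "h1", "h3"], 3: ["h2", "h3", "h1"]}, houses)
misreport = ps_rule({1: ["h2", "h1", "h3"], 2: ["h2", "h1", "h3"], 3: ["h2", "h3", "h1"]}, houses)

utility = {"h1": 7, "h2": 6, "h3": 0}
eu_truthful = sum(utility[h] * truthful[1][h] for h in houses)    # 21/4
eu_misreport = sum(utility[h] * misreport[1][h] for h in houses)  # 11/2
print(eu_truthful, eu_misreport)
```

Using `Fraction` rather than floating point keeps the exhaustion test `remaining[h] > 0` exact, so each loop iteration is guaranteed to exhaust at least one house.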

Examples 1 and 2 of Kojima [16] show that manipulating the PS mechanism
can lead to an SD improvement when each agent can be allocated more than one
house. In light of the fact that the PS rule can be manipulated, we examine the
complexity of a single agent computing a manipulation, in other words, the best
response for the PS rule.$^1$ For a preference profile $\succ$, we denote by $(\succ_{-i}, \succ'_i)$ the
preference profile obtained from $\succ$ by replacing agent $i$'s preference by $\succ'_i$. For
$\mathcal{E} \in \{SD, EU, DL\}$, we define the problem $\mathcal{E}$-BR: Given $(N, H, \succ)$, compute
a preference $\succ'_1$ for agent 1 such that there exists no preference $\succ''_1$ such that
$PS(N, H, (\succ''_1, \succ_{-1})) \succ_1^{\mathcal{E}} PS(N, H, (\succ'_1, \succ_{-1}))$.

For a constant $m$, the problem $\mathcal{E}$-BR can be solved by brute force by trying
out each of the $m!$ preferences. Hence we will not assume that $m$ is a constant.
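For small instances, the brute-force approach can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `ps_rule` is a straightforward simulation of the eating process, `brute_force_best_response` and the two `key` functions are hypothetical helper names, and the utilities 7, 6, 0 are those used in the example above. A DL best response is obtained by maximizing the vector of shares, listed in the manipulator's true preference order, lexicographically.

```python
from fractions import Fraction
from itertools import permutations

def ps_rule(prefs, houses):
    """Simulate PS with exact arithmetic; prefs maps agent -> ordered list of houses."""
    remaining = {h: Fraction(1) for h in houses}
    alloc = {a: {h: Fraction(0) for h in houses} for a in prefs}
    while True:
        eaters = {}
        for a, order in prefs.items():
            for h in order:
                if remaining[h] > 0:
                    eaters.setdefault(h, []).append(a)
                    break
        if not eaters:
            return alloc
        dt = min(remaining[h] / len(g) for h, g in eaters.items())
        for h, g in eaters.items():
            for a in g:
                alloc[a][h] += dt
            remaining[h] -= dt * len(g)

def brute_force_best_response(prefs, houses, agent, key):
    """Try all m! reports of `agent`; keep the first report maximizing key(allocation)."""
    best_report, best_alloc = None, None
    for report in permutations(prefs[agent]):
        alloc = ps_rule({**prefs, agent: list(report)}, houses)[agent]
        if best_alloc is None or key(alloc) > key(best_alloc):
            best_report, best_alloc = report, alloc
    return best_report, best_alloc

houses = ["h1", "h2", "h3"]
prefs = {1: ["h1", "h2", "h3"], 2: ["h2", "h1", "h3"], 3: ["h2", "h3", "h1"]}
utility = {"h1": 7, "h2": 6, "h3": 0}

eu_key = lambda a: sum(utility[h] * a[h] for h in houses)  # expected utility
dl_key = lambda a: tuple(a[h] for h in prefs[1])           # shares in true preference order

eu_report, eu_alloc = brute_force_best_response(prefs, houses, 1, eu_key)
dl_report, dl_alloc = brute_force_best_response(prefs, houses, 1, dl_key)
print(eu_report, dl_report)
```

The enumeration is exponential in $m$, which is why it only serves as a baseline for checking small examples.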

We establish some more notation and terminology for the rest of the paper.
We will often refer to the PS outcomes for partial lists of houses and preferences.
We will denote by $PS(\succ_i^L, \succ_{-i})(i)$ the allocation that agent $i$ receives when his
preference is according to the ordered list $L$. Note that preferences and ordered lists
are interchangeable, except that a list need not contain all houses in $H$. When
an agent runs out of houses in his preference list, he stops eating. The length
of a list $L$ is denoted $|L|$, and we refer to the $k$th house in $L$ as $L(k)$. In the PS
rule, the eating start time of a house is the time point at which the house starts
to be eaten by some agent. In Example 1, the eating start times of $h_1$, $h_2$ and
$h_3$ are 0, 0 and 0.5, respectively.
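Eating start times can be recorded while simulating the eating process. The following Python sketch is an assumption-laden illustration (the function name `eating_start_times` and the data layout are not from the paper); it also handles partial lists, in which case an agent simply stops eating once his list is exhausted.

```python
from fractions import Fraction

def eating_start_times(prefs, houses):
    """Return {house: first time at which some agent starts eating it} under PS."""
    remaining = {h: Fraction(1) for h in houses}
    est = {}
    now = Fraction(0)
    while True:
        eaters = {}
        for a, order in prefs.items():
            for h in order:
                if remaining[h] > 0:
                    eaters.setdefault(h, []).append(a)
                    break
        if not eaters:
            return est
        for h in eaters:
            est.setdefault(h, now)  # record only the first time the house is eaten
        dt = min(remaining[h] / len(g) for h, g in eaters.items())
        for h, g in eaters.items():
            remaining[h] -= dt * len(g)
        now += dt

# Profile of Example 1: agent 1 ranks h1, h2, h3; agent 2 ranks h2, h1, h3; agent 3 ranks h2, h3, h1.
est = eating_start_times(
    {1: ["h1", "h2", "h3"], 2: ["h2", "h1", "h3"], 3: ["h2", "h3", "h1"]},
    ["h1", "h2", "h3"],
)
print(est)
```

Houses that are never eaten by anyone (possible only if all lists are partial) do not appear in the returned dictionary.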


4. Lexicographic best response 


In this section, we present a polynomial-time algorithm for DL-BR.
Lexicographic preferences are well-established in the assignment literature [see e.g.,
17, 18, 19]. Let $(N, H, \succ)$ be an assignment problem where $N = \{1, \ldots, n\}$ and
$H = \{h_1, \ldots, h_m\}$. We will show how to compute a DL best response for agent
$1 \in N$. It has been shown that when $m \le n$, then truth-telling is the DL best
response, but if $m > n$, then this need not be the case [17, 18].


Recall that a preference $\succ'_1$ is a DL best response for agent 1 if the fractional
allocation agent 1 receives by reporting $\succ'_1$ is DL preferred to any fractional
allocation agent 1 receives by reporting another preference. That is, there is no
preference $\succ''_1$ such that his share of a house $h$ when reporting $\succ''_1$ is strictly
larger than when reporting $\succ'_1$ while the share of all houses he prefers to $h$
(according to his true preference $\succ_1$) is the same whether reporting $\succ'_1$ or $\succ''_1$.

$^1$ Note that if an agent is risk-averse and does not have information about the other agents'
preferences, then his maximin strategy is to be truthful. The reason is that if all agents have
the same preferences, then the optimal strategy is to be truthful.

Our algorithm will iteratively construct a partial preference list for the
$i$ most preferred houses of agent 1. Without loss of generality, denote
$\succ_1: h_1, h_2, \ldots, h_m$.

For any $i$, $1 \le i \le m$, denote $H_i = \{h_1, \ldots, h_i\}$. A preference of agent 1
restricted to $H_i$ is a preference over a subset of $H_i$. For the preference of agent
1 restricted to $H_i$, the PS rule computes an allocation where the preference of
agent 1 is replaced with this preference and the preferences of all other agents
remain unchanged. The notions of DL best response and DL preferred fractional
assignments with respect to a subset of houses $H_i$ are defined accordingly for
restricted preferences of agent 1.

For a house $h \in H$, let $PS1(L, h) = (PS(\succ_1^L, \succ_{-1})(1))(h)$ denote the
fraction of house $h$ that the PS rule assigns to agent 1 when he reports the (partial)
preference $L$. We start with a simple lemma showing that a DL best response
for agent 1 for the whole set $H$ can be no better and no worse on $H_i$ than a DL
best response for $H_i$.

Lemma 1. Let $i \in \{1, \ldots, m\}$. A DL best response for agent 1 on $H$ gives the
same fractional assignment to the houses in $H_i$ as a DL best response for agent
1 on $H_i$.

Our algorithm will compute a list $L_i$ such that $L_i \subseteq H_i$.$^2$ The list $L_i$ will be
a DL best response for agent 1 with respect to $H_i$. Suppose the algorithm has
computed $L_{i-1}$. Then, when considering $H_i = H_{i-1} \cup \{h_i\}$, it needs to make
sure that the new fractional allocation restricted to the houses in $H_{i-1}$ remains
the same (due to Lemma 1). For the preference to be optimal with respect to
$H_i$, the algorithm needs to maximize the fractional allocation of $h_i$ to agent 1
under the previous constraint.

Our algorithm will compute a canonical DL best response that has several
additional properties. A preference $L_i$ for $H_i$ is no-0 if $L_i$ contains no house $h$
with $PS1(L_i, h) = 0$. Any DL best response for agent 1 for $H_i$ can be converted
into a no-0 DL best response by removing the houses for which agent 1 obtains a
fraction of 0. For a no-0 preference $L_i$ for $H_i$, the stingy ordering for a position $j$
is determined by running the PS rule with the preference $L_i(1) \oplus \cdots \oplus L_i(j-1)$ for
agent 1, where $\oplus$ denotes concatenation. It orders the houses from $\bigcup_{k=j}^{|L_i|} L_i(k)$
by increasing eating start times, and when two houses $h, h'$ have the same eating
start time, we order $h$ before $h'$ iff $h \succ_1 h'$. Intuitively, houses occurring early
in this ordering are the most threatened by the other agents at the time point
when agent 1 comes to position $j$. The following definition takes into account
that the eating start times of later houses may change depending on agent 1's
ordering of earlier houses.
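The stingy ordering for a position can be computed directly from the eating start times. The Python sketch below is illustrative only: the names `eating_start_times` and `stingy_ordering`, the two-agent profile, and the list `L` are assumptions chosen for this sketch, and the other agents are assumed to report complete lists so that every house has an eating start time.

```python
from fractions import Fraction

def eating_start_times(prefs, houses):
    """Return {house: first time at which some agent starts eating it} under PS."""
    remaining = {h: Fraction(1) for h in houses}
    est, now = {}, Fraction(0)
    while True:
        eaters = {}
        for a, order in prefs.items():
            for h in order:
                if remaining[h] > 0:
                    eaters.setdefault(h, []).append(a)
                    break
        if not eaters:
            return est
        for h in eaters:
            est.setdefault(h, now)
        dt = min(remaining[h] / len(g) for h, g in eaters.items())
        for h, g in eaters.items():
            remaining[h] -= dt * len(g)
        now += dt

def stingy_ordering(L, j, others, true_order, houses):
    """Order the houses L(j), ..., L(|L|) for position j (1-indexed).

    Agent 1 reports only the prefix L(1), ..., L(j-1); ties in eating start
    time are broken by agent 1's true preference `true_order`.
    """
    est = eating_start_times({1: L[:j - 1], **others}, houses)
    rank = {h: r for r, h in enumerate(true_order)}
    return sorted(L[j - 1:], key=lambda h: (est[h], rank[h]))

# Hypothetical two-agent instance with six houses.
houses = ["h1", "h2", "h3", "h4", "h5", "h6"]
others = {2: ["h3", "h6", "h4", "h5", "h1", "h2"]}
L = ["h3", "h1", "h4", "h2"]
order_at_2 = stingy_ordering(L, 2, others, houses, houses)
print(order_at_2)
```

Here agent 1 reports only $h_3$ before position 2, so agent 2 reaches $h_4$ before $h_1$ and $h_2$, which places $h_4$ first in the ordering.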


2 When we treat a list as a set we refer to the set of all elements occurring in the list. 
Algorithm 1 DL best response for n agents

Input: $(N, H, \succ)$
Output: DL best response of agent 1

$L_1 \leftarrow h_1$    // Best response for agent 1 w.r.t. $H_1 = \{h_1\}$
for $i = 2$ to $m$ do    // Compute a best response w.r.t. $H_2, \ldots, H_m$
    $p \leftarrow 0$
    if $\exists q \in \{1, \ldots, i-1\}$ such that $0 < PS1(L_{i-1}, L_{i-1}(q)) < 1$ then
        $p \leftarrow \max\{q \in \{1, \ldots, i-1\} : 0 < PS1(L_{i-1}, L_{i-1}(q)) < 1\}$
    end if
    for $q \leftarrow p + 1$ to $|L_{i-1}| + 1$ do    // New house $h_i$ inserted after position $p$
        $L_i^q \leftarrow L_{i-1}(1) \oplus \cdots \oplus L_{i-1}(q-1) \oplus h_i$
        while $|L_i^q| \le |L_{i-1}|$ do    // Complete the list according to the stingy ordering
            est $\leftarrow$ EST$(N, H, (L_i^q, \succ_2, \ldots, \succ_n))$
            $S \leftarrow \{h \in L_{i-1} \setminus L_i^q : \text{est}(h) \text{ is minimum}\}$
            $h_s \leftarrow$ first house among $S$ in $\succ_1$
            $L_i^q \leftarrow L_i^q \oplus h_s$
        end while
        if $PS1(L_i^q, h_i) = 0$ then
            $L_i^q \leftarrow L_{i-1}$
        end if
    end for
    $q \leftarrow p$    // Determine which $L_i^q$ is stingy
    worse$[p-1] \leftarrow$ true
    finished $\leftarrow$ false
    while finished = false do
        if $\exists h \in H_{i-1}$ such that $PS1(L_i^q, h) \ne PS1(L_{i-1}, h)$ then
            worse$[q] \leftarrow$ true
            $q \leftarrow q + 1$
        else
            worse$[q] \leftarrow$ false
            if $PS1(L_i^q, h_i) > 0$ and $PS1(L_i^q, h_i) < 1$ then
                if worse$[q-1]$ = false then
                    $q \leftarrow q - 1$
                end if
                finished $\leftarrow$ true
            else if $PS1(L_i^q, h_i) = 1$ then
                est $\leftarrow$ EST$(N, H, (L_i^q(1) \oplus \cdots \oplus L_i^q(q-1), \succ_2, \ldots, \succ_n))$
                if $\exists h \in \{L_i^q(q+1), \ldots, L_i^q(|L_i^q|)\}$ such that est$(h) \le$ est$(h_i)$ then
                    $q \leftarrow q + 1$
                else
                    finished $\leftarrow$ true
                end if
            end if
        end if
    end while
    $L_i \leftarrow L_i^q$
end for
return $L_m$



A preference $L_i$ for $H_i$ is stingy if it is a no-0 DL best response for agent
1 on $H_i$, and for every $j \in \{1, \ldots, |L_i|\}$, $L_i(j)$ is the first house in the stingy
ordering for this position such that there exists a DL best response starting
with $L_i(1) \oplus \cdots \oplus L_i(j)$. We note that, due to Lemma 1, there is a unique
stingy preference for each $H_i$.

Example 2. Consider the following assignment problem.

$\succ_1: h_1, h_2, h_3, h_4, h_5, h_6$
$\succ_2: h_3, h_6, h_4, h_5, h_1, h_2$

The preferences $h_3, h_1, h_4, h_2$ and $h_3, h_2, h_4, h_1$ are both no-0 DL best responses
for agent 1 with respect to $H_4$, allocating $p(1)(h_1) = 1$, $p(1)(h_2) = 1$, $p(1)(h_3) = 1/2$,
$p(1)(h_4) = 1/2$ to agent 1. When running the PS rule with $h_3$ as the
preference list, $h_4$'s eating start time comes first among $\{h_1, h_2, h_4\}$. However,
there is no DL best response for $H_4$ starting with $h_3, h_4$. The next house in the
stingy ordering is $h_1$. The preference $h_3, h_1, h_4, h_2$ is the stingy preference for
$H_4$.

The next lemma shows that when agent 1 receives a house partially (a fraction 
different from 0 and 1) in a DL best response, a stingy preference would not 
order a less preferred house before that house. 

Lemma 2. Let $L_i$ be a stingy preference for $H_i$. Suppose there is an $h_j \in H_i$
such that $0 < PS1(L_i, h_j) < 1$. Then, $P \subseteq H_j$, where $L_i = P \oplus h_j \oplus S$.

The next lemma shows how the houses allocated completely to agent 1 are
ordered in a stingy preference.

Lemma 3. Let $L_i$ be a stingy preference for $H_i$. If $h_j, h_k \in H_i$ are two houses
such that $PS1(L_i, h_j) = PS1(L_i, h_k) = 1$, with $L_i = P \oplus h_j \oplus M \oplus h_k \oplus S$, then
either the eating start time of $h_j$ is smaller than $h_k$'s eating start time when
agent 1 reports $P$, or it is the same and $h_j \succ_1 h_k$.

Proof. Suppose not. But then, $L_i$ is not stingy since swapping $h_j$ and $h_k$ in $L_i$
gives the same fractional allocation to agent 1. □

We now show that when iterating from a set of houses $H_{i-1}$ to $H_i$, the previous
solution can be reused up to the last house that agent 1 receives partially.

Lemma 4. Let $L_{i-1}$ and $L_i$ be stingy preferences for $H_{i-1}$ and $H_i$, respectively.
Suppose there is an $h \in H_{i-1}$ such that $0 < PS1(L_{i-1}, h) < 1$. Then the prefixes
of $L_{i-1}$ and $L_i$ coincide up to $h$.

We are now ready to describe how to obtain $L_i$ from $L_{i-1}$. See Algorithm 1
for the pseudocode. The subroutine EST$(N, H, \succ)$ executes the PS rule for
$(N, H, \succ)$ and, for each item, records the first time point where some agent
starts eating it. It returns the eating start times est$(h)$ for each house $h \in H$.

Let $p$ be the last position in $L_{i-1}$ such that the house $L_{i-1}(p)$ is partially
allocated to agent 1. In case agent 1 receives no house partially, set $p := 0$
and interpret $L_{i-1}(p)$ as an imaginary house before the first house of $L_{i-1}$. By
Lemma 4, we have that $L_{i-1}(s) = L_i(s)$ for all $s \le p$. By Lemma 1, we have
that the fractional assignment resulting from $L_i$ must wholly allocate all houses
$L_{i-1}(p+1), \ldots, L_{i-1}(|L_{i-1}|)$ to agent 1, and allocate a share of 0 to all houses
in $H_{i-1} \setminus L_{i-1}$.

It remains to find the right ordering for $\{L_{i-1}(s) : p + 1 \le s \le |L_{i-1}|\} \cup
\{h_i\}$. By Lemmas 2 and 3, the prefixes of $L_{i-1}$ and $L_i$ coincide up to $h$. We
will describe in the next paragraph how to determine the position $q$ where $h_i$
should be inserted. Having determined this position, one may then need to
re-order the subsequent houses. This is because inserting $h_i$ in the list may
change the eating start times of the subsequent houses. This leads us to the
following insertion procedure. The list $L_i^q$ obtained from $L_{i-1}$ by inserting $h_i$
at position $q$, with $p < q \le |L_{i-1}| + 1$, is determined as follows. Start with
$L_i^q := L_{i-1}(1) \oplus \cdots \oplus L_{i-1}(q-1) \oplus h_i$. While $|L_i^q| \le |L_{i-1}|$, we append to
the end of $L_i^q$ the first house among $L_{i-1} \setminus L_i^q$ in the stingy ordering for this
position. After the while-loop terminates, run the PS rule for the resulting list
$L_i^q$. In case we obtain that $PS1(L_i^q, h_i) = 0$, we remove $h_i$ again from this list
(and actually obtain $L_i^q = L_{i-1}$).

The position $q$ where $h_i$ is inserted is determined as follows. Start with
$q := p$. We have an array worse keeping track of whether the lists $L_i^1, \ldots, L_i^q$
produce a worse outcome for agent 1 than the list $L_{i-1}$. Set worse$[p-1]$ := true.
As long as the list $L_i$ has not been determined, proceed as follows. Obtain
$L_i^q$ from $L_{i-1}$ by inserting $h_i$ at position $q$, as described earlier. Consider the
allocation of agent 1 when he reports $L_i^q$. If this allocation is not the same for
the houses in $H_{i-1}$ as when reporting $L_{i-1}$, then set worse$[q]$ := true, otherwise
set worse$[q]$ := false. If worse$[q]$, then increment $q$. This is because, by Lemma 1,
this preference would not be a DL best response with respect to $H_i$. Otherwise,
if $0 < PS1(L_i^q, h_i) < 1$, then we can determine $h_i$'s position. If worse$[q-1]$,
then set $L_i := L_i^q$, otherwise set $L_i := L_i^{q-1}$. This position for $h_i$ is optimal
since moving $h_i$ later in the list would decrease its share to agent 1. Otherwise,
we have that worse$[q]$ = false and $PS1(L_i^q, h_i) \in \{0, 1\}$. This will be the share
agent 1 receives of $h_i$. If $PS1(L_i^q, h_i) = 0$, then set $L_i := L_{i-1}$. Otherwise
($PS1(L_i^q, h_i) = 1$), it still remains to check whether the current position for
$h_i$ gives a stingy preference. For this, run the PS rule with the preference
$L_i^q(1) \oplus \cdots \oplus L_i^q(q-1)$ for agent 1. If $h_i$'s eating start time is smaller than the
eating start time of each house $L_i^q(r)$ with $r > q$, then set $L_i := L_i^q$, otherwise
increment $q$.

Thus, given $L_{i-1}$, the preference $L_i$ can be computed by executing the PS
rule $O(m)$ times. The DL best response computed by the algorithm is $L_m$.
Since the PS rule can be implemented to run in linear time $O(nm)$, the running
time of this DL best response algorithm is $O(nm^3)$.

Theorem 1. DL-BR can be solved in $O(nm^3)$ time.

Example 3. Consider the following instance. 

$\succ_1: h_1, h_2, h_3, h_4, h_5, h_6, h_7, h_8, h_9, h_{10}$

>~2'. hg, he, he, hg, h±o, hi, he, hj, / 14 , hg 

^ 3 : hg, hi, h-j, hi, h.2,he, he, he, hg, hw 

After having computed L 2 = hi,h 2 , the algorithm is now to consider Hg. Since 
PS1(L2, hi) = PS1(L2, hg) = 1, the algorithm first considers L\ = hg,h 2 ,hi. 
Note that hi and h .2 have been swapped with respect to L 2 since agent 2 starts 
eating hg before agent 3 starts eating hi when agent 1 reports the preference 
list consisting of only hg. It turns out that PSl(L\,hi) = PSl(L\,hg) = 
PS\{L\,hg) = 1. Thus, worse[l] = false. Since hg does not come first in the 
stingy ordering, the algorithm needs to verify whether moving hg later will still 
give a DL best response with respect to Hg. It then considers L§ = hi,hg,hg. 
However, this allocates only half of hg to agent 1, implying worse[2] = true. Since 
worse[1] = false, the algorithm sets $L_3 = L_3^1$. The DL best response computed by
the algorithm is L\q = h$, / 12 , hi, he- 

We note that a DL best response is also an SD best response. One may wonder
whether an algorithm to compute the DL best response also provides us with an
algorithm to compute an EU best response. However, a DL best response may
not be an EU best response for three or more agents. Consider the preference
profile in Example 1. Since the number of houses is equal to the number of
agents, reporting the truthful preference is a DL best response. However,
we have shown a different preference for agent 1 where he may obtain higher
utility.


5. Expected utility best response 

In this section, we consider the problem of expected utility best response. 


5.1. Case of two agents 

We first show that for the case of two agents, an EU best response can
be computed in linear time. The result hinges on a close connection that we
identify between PS and discrete allocation of objects to agents via sequential
allocation. In the sequential allocation setting $(N, O, \succ', \pi)$, there is an agent set
$N$, an object set $O = \{o_1, \ldots, o_{m'}\}$, a preference profile $\succ'$ that specifies for each
agent $i \in N$ his preferences $\succ'_i$ over $O$, and a policy $\pi : \{1, \ldots, m'\} \to N$. The
sequential allocation rule works as follows. For $j = 1$ to $m'$, agent $\pi(j)$
gets his most preferred object that is not yet allocated. If no unallocated object
is on the preference list of the agent, then the agent does not get any object when
his turn comes. The assignment as a result of sequential allocation is denoted
by $SA(N, O, \succ', \pi)$. We will restrict ourselves to the case where $N = \{1, 2\}$ and
will only consider the alternating policy $\pi^* = 1212\ldots$ in which agent 1 starts
first and then the agents keep alternating. The sequential allocation setting was
introduced by Kohler and Chandrasekaran [15], who showed that the best
response can be computed in linear time when $|N| = 2$ and the policy is the
alternating sequence. Recently, Bouveret and Lang [7] generalized their result
to the case of any number of agents, any policy, and where the manipulator may
be indifferent between objects.

We highlight a close connection between sequential allocation and PS and
thereby between allocation mechanisms for indivisible and divisible houses. For
the random assignment setting $(\{1, 2\}, H, \succ)$, the half-house reduction gives us
the sequential allocation setting $(\{1, 2\}, O, \succ', \pi^*)$. In the reduction, each house
$h_j \in H$ is cloned so that we have two half-houses $h_j^1$ and $h_j^2$ for each house $h_j$:
$O = \{h_j^1, h_j^2 : j = 1, \ldots, m\}$. Both agents have preferences over half-houses that
are consistent with their preferences over houses and for each house, each agent
prefers the first half-house slightly more than the second half-house: if $h_j \succ_i h_k$,
then $h_j^1 \succ'_i h_j^2 \succ'_i h_k^1 \succ'_i h_k^2$. We show that for $n = 2$, the assignment under
PS is 'essentially' the same as the assignment obtained by applying sequential
allocation to the setting resulting from the half-house reduction:
Remark 1. The assignment $PS(\{1, 2\}, H, \succ)$ and the assignment
$SA(\{1, 2\}, O, \succ', \pi^*)$ are related as follows:

$$PS(\{1, 2\}, H, \succ)(i)(h_j) = \tfrac{1}{2} \cdot \left( SA(\{1, 2\}, O, \succ', \pi^*)(i)(h_j^1) + SA(\{1, 2\}, O, \succ', \pi^*)(i)(h_j^2) \right).$$

We note that in the half-house reduction, each preference list $\succ'_i$ satisfies the
consecutivity property: half-houses corresponding to the same house are placed
consecutively in the preference list. We will use the consecutivity property in
our argument.
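The relation in Remark 1 can be checked on small instances. The following Python sketch is an illustration under stated assumptions: `ps_rule`, `sequential_allocation`, `half_house_prefs` and `remark1_holds` are hypothetical helper names, half-houses are encoded as pairs `(house, 1)` and `(house, 2)`, and both agents report complete strict preferences.

```python
from fractions import Fraction
from itertools import permutations

def ps_rule(prefs, houses):
    """Simulate PS with exact arithmetic; prefs maps agent -> ordered list of houses."""
    remaining = {h: Fraction(1) for h in houses}
    alloc = {a: {h: Fraction(0) for h in houses} for a in prefs}
    while True:
        eaters = {}
        for a, order in prefs.items():
            for h in order:
                if remaining[h] > 0:
                    eaters.setdefault(h, []).append(a)
                    break
        if not eaters:
            return alloc
        dt = min(remaining[h] / len(g) for h, g in eaters.items())
        for h, g in eaters.items():
            for a in g:
                alloc[a][h] += dt
            remaining[h] -= dt * len(g)

def sequential_allocation(prefs, objects, policy):
    """Each agent in `policy` order takes his most preferred object still available."""
    available = set(objects)
    bundle = {a: set() for a in prefs}
    for a in policy:
        for o in prefs[a]:
            if o in available:
                available.remove(o)
                bundle[a].add(o)
                break
    return bundle

def half_house_prefs(order):
    """Half-house reduction: first half of a house is ranked just above its second half."""
    return [(h, k) for h in order for k in (1, 2)]

def remark1_holds(pref1, pref2):
    houses = list(pref1)
    ps = ps_rule({1: pref1, 2: pref2}, houses)
    objects = [(h, k) for h in houses for k in (1, 2)]
    policy = [1, 2] * len(houses)  # alternating policy 1212...
    sa = sequential_allocation(
        {1: half_house_prefs(pref1), 2: half_house_prefs(pref2)}, objects, policy
    )
    return all(
        ps[i][h] == Fraction(sum((h, k) in sa[i] for k in (1, 2)), 2)
        for i in (1, 2) for h in houses
    )

names = ["a", "b", "c"]
print(all(remark1_holds(list(p), list(q))
          for p in permutations(names) for q in permutations(names)))
```

The check enumerates all pairs of strict preferences over three houses; the same loop can be run for larger house sets at factorial cost.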

Theorem 2. For the case of two agents, an EU best response can be computed
in linear time.

Proof. We consider the EU best response problem for PS where the
manipulator, agent 1, has preferences $\succ_1: h_1, \ldots, h_m$. The main idea is to reduce the EU
best response problem $(\{1, 2\}, H, \succ)$ for PS to the EU best response problem
$(\{1, 2\}, O, \succsim', \pi^*)$ for sequential allocation. The reduction is a slight
modification of the half-house reduction with the difference that agent 1 is indifferent
between two half-houses corresponding to the same house. The object set is
$O = \{h_j^1, h_j^2 : j = 1, \ldots, m\}$. In $\succsim'$, both agents have preferences over
half-houses that are consistent with their preferences over houses. We will assume
without loss of generality that agent 2 prefers the first half-house slightly more
than the second half-house. Agent 1 is indifferent between any two half-houses
corresponding to the same house: $h_j^1 \sim'_1 h_j^2$ for all $j \in \{1, \ldots, m\}$, but will be
required to report strict preferences. When we consider sequential allocation, we
will view it in rounds so that in each round, first agent 1 picks a most preferred
available house and then agent 2 picks a most preferred available house.

In the algorithm by Bouveret and Lang [7], when agents have strict
preferences, it is checked whether the manipulator (agent 1) can get different target
sets of objects. In the algorithm, only a linear number of target sets need to be
considered. Given target set $T_k$, which is restricted to objects from $o_1, \ldots, o_k$, we
can compute target set $T_{k+1}$ as follows: check whether target set $T_k \cup \{o_{k+1}\}$
can be achieved or not. $T_{k+1} = T_k \cup \{o_{k+1}\}$ if $T_k \cup \{o_{k+1}\}$ can be achieved
and $T_{k+1} = T_k$ otherwise. $T_m$ is then the most preferred allocation that agent
1 achieves and the allocation is unique. When the manipulator is indifferent
among objects, Bouveret and Lang [7] showed that their algorithm can be
easily modified as follows: agent 1 considers a linear order instead of his actual
weak order where the linear order is achieved by breaking ties between the
indifferent objects in the same order as the preference of agent 2. Based on this
insight, observe that both agents will pick $h_j^1$ before $h_j^2$ for any $j \in \{1, \ldots, m\}$
if they report truthfully.

We first show that there exists a best response of agent 1 in the sequential
allocation setting $(N, O, \succsim', \pi^*)$ that satisfies the consecutivity property. If agent
1 either gets both half-houses corresponding to a house or none of them, then his
optimal preference report for sequential allocation trivially satisfies the
consecutivity property. If this is not the case, then let us consider the most preferred
house $h_j$ for which agent 1 gets one of the corresponding half-houses but not the
other. If agent 1 only gets $h_j^1$ but not $h_j^2$, this means that in his best response for
houses restricted to $\{h_1^1, \ldots, h_j^2\}$, $h_j^2$ was already taken by agent 2 in a round in
which agent 1 picked some other object. Then agent 1 can insert $h_j^2$
immediately after $h_j^1$ in his best response preference, knowing well that he will
not get $h_j^2$. Thus, the best response for sequential allocation can be modified so
that it satisfies the consecutivity property and yields the same optimal
allocation. Now consider the case where agent 2 gets $h_j^1$ but agent 1 gets $h_j^2$. Then
this means that agent 1 cannot get $h_j^1$ in his best response when his preference
is restricted only to houses from the set $\{h_1^1, h_1^2, \ldots, h_{j-1}^1, h_{j-1}^2, h_j^1\}$. Therefore,
agent 1 can still insert $h_j^1$ just before $h_j^2$ in his best response so that
the consecutivity property is satisfied and the allocation does not change even
though agent 1 does not get $h_j^1$ in his best response.

We now show that the best response of agent 1 in the sequential allocation
setting $(N, O, \succ', \pi^*)$ can be used to compute the best response of
agent 1 in $(N, H, \succ)$ under PS. Let $U$ be the expected utility for agent 1
under his best response in the PS setting. The best response $\succ^{*}$
corresponds to $\succ^{*\prime}$ over the set of half-houses. By Remark 1, agent
1 achieves essentially the same allocation and hence the same utility $U$ in the
sequential allocation setting if he submits preference $\succ^{*\prime}$.
Conversely, if agent 1 achieves utility $U$ in the sequential allocation setting
via a preference report, then he achieves at least as much utility by reporting
his optimal preference $\succ^{*\prime}$ constructed via the algorithm of
Bouveret and Lang [7]. Hence, the preference $\succ^{*\prime}$ can be modified as
shown above so that it satisfies the consecutivity property. In this case, there
exists a preference $\succ^{*}$ over $H$ which is consistent with the
preferences over $O$. If agent 1 reports $\succ^{*}$, then he gets essentially
the same allocation as $SA(\{1, 2\}, O, (\succ^{*\prime}, \succ_2'), \pi^*)$ and
thus gets utility $U$. □

The best response algorithm of Bouveret and Lang [7] returns the same
optimal preference report for all cardinal utilities consistent with the ordinal 
preference of the manipulator. Next, we point out that for the case of two agents 
and the PS rule, a DL best response and an EU best response are equivalent. 

Proposition 1. For the case of two agents and the PS rule, a DL best response 
is an EU best response and an EU best response is a DL best response. 

Proof. For two agents, PS assigns probabilities from the set {0,1/2,1}. Hence 
DL preferences can be represented by EU preferences where the utilities are 
exponential: the utility of a more preferred house is twice the utility of the next 
most preferred house. Hence a response is a DL best response if it is an EU 
best response for exponential utilities. On the other hand we have shown that 
for two agents and the PS rule, an EU best response is the same for any utilities 
compatible with the preferences. Hence for two agents, an EU best response for 
any utilities is the same as the EU best response for exponential utilities which 
in turn is the same as a DL best response. □ 

5.2. General case 

We show that an EU best response is NP-hard to compute. The result
contrasts with Theorem 1, which states that a DL best response can be computed
in polynomial time.

Theorem 3. EU-BR is NP-hard. 


Proof. To show hardness we show that the following problem is NP-complete:
given an assignment setting as well as a utility function $u : H \to \mathbb{N}$
specifying the utility of each house for the manipulator (agent 1) and a target
utility $T$, can the manipulator specify preferences such that the utility of
his allocation under the PS rule is at least $T$? We reduce from a restricted
NP-hard version of 3SAT where each literal appears exactly twice in the formula.
Given such a 3SAT instance $F = (X, C)$ where $X = \{x_1, \ldots, x_n\}$ is the
set of variables and $C$ the set of clauses, we build an instance of EU-BR where
the manipulator can obtain utility at least $T$ if and only if the formula is
satisfiable. At a high level, we will create an instance of the assignment
problem which can be conceptualized as 18 (mostly) disjoint parts that we index
by $D \in \{1, \ldots, 18\}$. We will describe the main (first) part in detail
and explain how it is duplicated to create the other 17 parts. Each of the 18
parts is divided up into $n$ choice rounds which we index from 1 to $n$. For
each part there is an additional clause round. The 18 parts are linked by a
special set of houses which allow us to synchronize the timing of the
manipulator with respect to all the other agents. The set of agents is

$N = \{1\} \cup \bigcup_{D=2}^{18} \{a^D_{dummy}\} \cup \bigcup_{D=1}^{18} A^D_{literals}$,

consisting of the manipulator (agent 1), 17 ‘dummy’ manipulators for the 17
copies of the main part, and two agents for each positive and negative literal
in the formula for each of the 18 parts:

$A^D_{literals} = \{a^{1,D}_{x_i}, a^{2,D}_{x_i}, a^{1,D}_{\neg x_i}, a^{2,D}_{\neg x_i} : x_i \in X\}$.


The set of houses is

$H = H_{slow} \cup \bigcup_{D=1}^{18} H^D_{rounds} \cup \bigcup_{D=1}^{18} H^D_{clause} \cup \bigcup_{D=2}^{18} \{h^D_{cp}\} \cup \{h_{prize}\}$,

where $H_{slow} = \{h^r_s : r \in \{1, \ldots, n-1\}\}$ is the set of slowdown
houses that will be used to control the timing of the manipulator’s decisions.
Note that there is only one slowdown house per round and these houses are shared
between all 18 parts.
$H^D_{rounds} = \{h^{r,D}_{x_i}, h^{r,D}_{\neg x_i} : r \in \{1, \ldots, n\}, i \in \{1, \ldots, n\}\}$
is a set of houses consisting of one house for each positive and negative
literal in the formula for each of the $n$ rounds;
$H^D_{clause} = \{h^{1,D}_c, h^{2,D}_c, h^{3,D}_c : c \in \{1, \ldots, |C|\}\}$
is a triplet of houses for each clause in the formula; $h_{prize}$ is the prize
house for the manipulator; and $\bigcup_{D=2}^{18} \{h^D_{cp}\}$ is the set of
consolation prize houses for the dummy manipulators.

We will describe how to construct the preferences for the main part, which
contains the manipulator (agent 1), and then explain the small differences
necessary to create the 17 other duplicate instances. Example 4 gives an
illustration of the main part of a small instance and may be helpful for
reference during the discussion.


Main part. We will describe the rounds by declaring which houses are eaten
in them and show how the preference lists of the agents are constructed. Each
agent’s preference list can be described as having a head and a tail. To ease
the description, we will omit the part index $D = 1$ in the variable names.
Intuitively, the head consists of the houses that the agent will consume during
the running of the PS algorithm while the tail consists of houses that will not
be eaten. When we describe how we add houses to an agent’s preference list,
we will say append the house(s) to the head to mean add this set of houses to
the end of the head of the preference list, behind those that have been placed
before. We say append the house(s) to the tail of the preferences to mean place
them last amongst all houses which have been placed in the preferences so far.

In each choice round $r$, houses $h^r_{x_i}$ and $h^r_{\neg x_i}$ for each
$i \in \{1, \ldots, n\}$ will be eaten. Append those houses to the head of the
preferences of the agents corresponding to the same literal and append them to
the tail of the preferences of agents associated with a different literal.
Append houses $h^r_{x_r}$ and $h^r_{\neg x_r}$ to the head of the manipulator’s
preferences (the order in which we add them is not important). Houses
$h^r_{x_i}$ and $h^r_{\neg x_i}$ where $i \neq r$ are appended to the tail of
the manipulator’s preferences. In each choice round except the last one,
slowdown house $h^r_s$ will be eaten. We append it to the tail of the
preferences of the literal agents, and to the head of the preferences of the
manipulator (after the literal houses we added for this round).

Finally we describe the clause round. For each clause $c$, we have the 3 houses
$h^1_c, h^2_c, h^3_c$. We append these 3 houses to the head of the preferences
of exactly 1 agent corresponding to the negation of each of the literals of
clause $c$. If an agent has already had houses added to his preferences in the
clause round, we add them to the other agent corresponding to the same literal
(since a literal appears only twice in the formula, this ensures each agent has
only one triplet of houses appended to the head of their preferences). The prize
house $h_{prize}$ is appended to the head of both the manipulator’s and the
literal agents’ preferences (after the clause houses we just added to the
literal agents).
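
The construction of the heads in the main part can be sketched in a few lines.
This is only an illustration under assumptions: the function name
`main_part_heads`, the string names of agents and houses (for example
`a1_~x2` or `h1_c3`), and the choice to process clauses in their listed order
are all hypothetical and not fixed by the text above.

```python
from typing import Dict, List, Tuple

Literal = Tuple[int, bool]   # (variable index i, True for x_i, False for its negation)


def lit_name(i: int, positive: bool) -> str:
    return f"x{i}" if positive else f"~x{i}"


def main_part_heads(n: int, clauses: List[List[Literal]]) -> Dict[str, List[str]]:
    """Heads of the preference lists of the main part (tails are omitted)."""
    literals = [(i, pos) for i in range(1, n + 1) for pos in (True, False)]
    heads: Dict[str, List[str]] = {"manipulator": []}
    for i, pos in literals:
        for k in (1, 2):                               # two agents per literal
            heads[f"a{k}_{lit_name(i, pos)}"] = []

    # Choice rounds: each literal agent eats the round-r house of its own literal.
    for r in range(1, n + 1):
        for i, pos in literals:
            for k in (1, 2):
                heads[f"a{k}_{lit_name(i, pos)}"].append(f"h{r}_{lit_name(i, pos)}")
        heads["manipulator"] += [f"h{r}_x{r}", f"h{r}_~x{r}"]
        if r < n:                                      # no slowdown house in the last round
            heads["manipulator"].append(f"h{r}_s")

    # Clause round: one agent for the negation of each literal of c gets c's triplet.
    for c, clause in enumerate(clauses, start=1):
        for i, pos in clause:
            neg = lit_name(i, not pos)
            first_agent_free = len(heads[f"a1_{neg}"]) == n   # no triplet added yet
            agent = f"a1_{neg}" if first_agent_free else f"a2_{neg}"
            heads[agent] += [f"h{j}_c{c}" for j in (1, 2, 3)]

    for head in heads.values():                        # the prize house closes every head
        head.append("h_prize")
    return heads
```

For a formula in which every literal appears exactly twice, each literal agent
ends up with exactly one clause triplet, as the construction requires.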

Duplicate parts. For each of the duplicate parts, $D \in \{2, \ldots, 18\}$, we
will describe the necessary modifications. For clarity we call the copy of the
prize house in the duplicated parts of the instance consolation prize houses,
denoted $h^D_{cp}$ for each $D \in \{2, \ldots, 18\}$. Recall that the set of
slowdown houses $H_{slow}$ is shared between all the parts; thus all the
parallel constructions ‘merge’ at the set of slowdown houses. We are left with
the fact that houses from a given duplicate part $D$ of the instance have not
been added to the preferences of agents from all other parts of the instance.
We can append all these houses to the tail of the preferences of the agents
outside this part in any order.

The manipulator’s utilities. We will give the manipulator’s utility in terms of
a number $\alpha$ to be fixed later. The prize house has utility 1. The literal
houses that are appended to the head of the manipulator’s preferences during
round $i$ ($h^i_{x_i}$ and $h^i_{\neg x_i}$) have utility $(2\alpha)^{2(n-i)}$
and $(2\alpha)^{2(n-i)} + \epsilon$ where $\epsilon$ is 0(£ r ). The slowdown
houses have utility $(2\alpha)^{2(n-i-1)+1}$. All other houses have negligible
utility. By negligible we mean that adding up all their combined utilities will
yield less than ^ utility. This can be done since we have a polynomial number
of houses and we can make the utilities exponentially small.
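
To see the scale of these utilities, the following sketch lists them in the
order in which the manipulator is meant to eat the houses. It rests on an
assumption about the exponents, namely $2(n-i)$ for the literal houses of round
$i$ and $2(n-i-1)+1$ for the slowdown house of round $i$, and it ignores the
small $\epsilon$ term; the function names are hypothetical.

```python
from typing import List


def literal_utility(alpha: int, n: int, i: int) -> int:
    """Utility of a round-i literal house (epsilon term ignored)."""
    return (2 * alpha) ** (2 * (n - i))


def slowdown_utility(alpha: int, n: int, i: int) -> int:
    """Utility of the slowdown house of round i (defined for i < n)."""
    return (2 * alpha) ** (2 * (n - i - 1) + 1)


def utility_ladder(alpha: int, n: int) -> List[int]:
    """Utilities in eating order: literal, slowdown, literal, slowdown, ..."""
    ladder: List[int] = []
    for i in range(1, n + 1):
        ladder.append(literal_utility(alpha, n, i))
        if i < n:                      # the last round has no slowdown house
            ladder.append(slowdown_utility(alpha, n, i))
    return ladder
```

Under this reading each step down the ladder divides the utility by $2\alpha$,
so a house eaten out of turn is worth exponentially less than the house the
manipulator should be eating.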

Based on these utilities we can now derive a target value for $T$ and analyse
the behaviour that the manipulator must have to reach that target. The
manipulator may only start eating a new house once the house he is currently
eating is no longer available. This means that if he starts eating a house, he
is ‘stuck’ eating said house for a certain amount of time. We now constrain
the manipulator’s possibilities by showing that by diverging from the literal
and slowdown houses he should be eating according to his preferences, he will
commit to a house for which he has exponentially less utility for an amount of
time which is at least some constant. By setting $\alpha$ to be large enough, we
can ensure that this loss in utility cannot be recovered. We say the manipulator
behaves as prescribed if he declares preferences which correspond to his true
preferences up to permutations of the literals associated with the same
variable.

Let $t_1 > 0$ be the smallest amount of time the manipulator will eat a new
house if he has behaved as prescribed in all his previous choices. The next
lemma shows that $t_1$ is independent of the instance size.

Lemma 5. $t_1 \in O(1)$.

Proof. As the algorithm progresses, we may group the agents into a constant
number of groups based on the extent to which they have eaten their current
house when the manipulator finishes consuming one of his houses and on the
number of agents eating that house. Each group is associated with a value, which
corresponds to the amount of time the manipulator would have to spend if he
decided to eat a house currently being eaten by members of that group. By
showing that the number of these groups is constant, and therefore so is the
number of values, we show that $t_1$ is a constant. The groups can be
characterized by the type of house that the members are eating. At any point in
the algorithm we say that a literal has been chosen by the manipulator if the
round $r$ is greater than the index $i$ of that literal, $r > i$. We say that a
literal is untouched by the manipulator for $i > r$. The groups are defined as
follows: 1) agents eating houses being eaten by an agent corresponding to a
literal which has been chosen by the manipulator; 2) agents eating houses being
eaten by an agent corresponding to a literal which is the negation of one chosen
by the manipulator; 3) agents eating houses corresponding to literals untouched
by the manipulator; 4) agents eating houses being eaten by dummy manipulators.

At the start of any round $i$, eating a house from group $j$ would take $g_j^1$
time. The manipulator then finishes eating the first literal, and eating a house
from group $j$ would take $g_j^2$ time. After eating the second literal, eating
a house from group $j$ would take $g_j^3$ time. Finally the manipulator eats the
slowdown house and we have the corresponding value $g_j^4$. We will now show
that the values $g_j^k$ are the same for all rounds. To show this we simply need
to make sure that all the agents stay ‘synchronised’. It takes the manipulator
0.5 units of time to finish the current round ($\frac{1}{3}$ on the first
literal, $\frac{1}{9}$ on the second, and $\frac{1}{18}$ on the slowdown house).
Let us now show that it also takes 0.5 units of time for every other group to
get to the same point in the next round. The exceptions are the agents eating a
house that is also being eaten by the manipulator or some dummy in that round,
which fall out of sync with their previous group (group 3 or 4) and move to
either group 1 or 2. For groups 1-3, all these agents pair up and have 1 house
per round. It therefore takes each of them 0.5 time to eat it. For group 4, the
dummy manipulators eat a first literal ($\frac{1}{3}$), then a second
($\frac{1}{9}$), and finally all 18 manipulators join together and eat the
slowdown house of the round, which takes them time $\frac{1}{18}$. This adds up
to $\frac{9}{18} = 0.5$. □
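
The timing of a single choice round can be checked with exact fractions. The
sketch below is a reading of the construction rather than part of the proof: it
assumes that every literal house is eaten by its two literal agents at unit
speed, that the manipulator joins the first literal house at the start of the
round, and that the slowdown house is shared by all 18 manipulators. The
function name `round_timings` is hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def round_timings(num_manipulators: int = 18,
                  agents_per_literal: int = 2) -> Tuple[Fraction, Fraction, Fraction]:
    """Time a manipulator spends on the first literal, second literal, slowdown house."""
    eaters = agents_per_literal + 1                    # literal agents plus one manipulator
    first = Fraction(1, eaters)                        # house shared by all eaters from the start
    # Meanwhile the second literal house was eaten by its own literal agents only.
    left = 1 - agents_per_literal * first
    second = left / eaters                             # the remainder is again shared by all eaters
    slowdown = Fraction(1, num_manipulators)           # one house shared by every manipulator
    return first, second, slowdown
```

With the default values the three durations are 1/3, 1/9 and 1/18, which sum to
1/2, the time the paired literal agents need for one house. This also suggests
why 18 parts are used: 1/18 is exactly the slack left after 1/3 + 1/9.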

Corollary 1. There is a value for $\alpha \in O(1)$ such that the manipulator
behaves as prescribed.

Lemma 6. In the clause round all agents corresponding to literals chosen by the 
manipulator start the round at the same time as the manipulator, whilst agents
corresponding to negation of the choice of the manipulator are in advance and 
start the round i units of time before the manipulator. 

Proof. In Lemma 5 we argued that the agents took the same amount of time to
finish a round. The exception to this is the last round where the manipulator 
does not eat any slowdown houses and therefore finishes the round at the same 
time as group 1. Group 2 finishes the round i before group 1 since the manip- 
ulator spent ^ time eating a house with them whereas he spent ^ time eating a 
house with agents from group 2. This results in a | — | = | delay between the 
two. □ 

The manipulator’s choice corresponds to an assignment of the variables in
the SAT formula. If the manipulator chose to eat house $h^r_{x_r}$ before
$h^r_{\neg x_r}$ then this corresponds to setting $x_r$ to true (and vice
versa). Thus, in each round the manipulator chooses an assignment for a variable
in the formula. The target utility $T$ is the sum of | of the utility of
$h^r_{x_r}$ and $\frac{1}{18}$ of the utility of the slowdown house $h^r_s$
(except in the last round) for each round $r$ and an extra

Lemma 7. In the clause round, the manipulator must eat the prize house before 
any other agent to reach the target utility T. 

Lemma 8. $F$ is satisfiable iff the manipulator can reach the target utility $T$.

Proof. ($\Rightarrow$) We have set $T$ so that if the manipulator declared a
prescribed preference profile, he will require an extra || − $\epsilon \cdot n$ utility to reach $T$. If all
clauses are satisfied, at most 2 of the agents eating the houses corresponding to 
a clause will be in advance and the manipulator will have 11 units of time to eat 
the prize house alone. The manipulator will always have | units of time to eat 
the prize house alone while the other literal agents are eating the corresponding 
clause houses. In the worst case, 2 agents are in advance for any clause by 
units of time, which means that they, along with the third agent in the clause, 
will finish their triplet of clause houses after (7 + ^7 units of time, at which time 
all three agents will begin eating the prize house. This leaves the manipulator 
to eat alone for extra time thus ensuring him extra utility > 

($\Leftarrow$) If the truth assignment causes a clause to be unsatisfied, the agents
corresponding to the negation of the literals in the clause (and therefore eating the
clause houses corresponding to the clause) will all be in advance and will finish 
eating the clause houses before the manipulator has eaten || of the prize house. 
If all 3 agents are in advance, they will finish eating the clause houses || units
of time after the manipulator has started eating the prize house. Therefore for
of the prize house there are at least 3 extra agents eating the prize house. 
Since this makes at least 4 agents eating dh 0 f the prize house, the manipulator 
will get at most A. instead of the required ^ of the prize house after he has 
eaten a share of . Since the prize house is the only remaining house with 
non-negligible utility, and we have made $\alpha$ large enough, he cannot
compensate for this loss of utility by getting more of some other house. □
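
The timing argument of the clause round can be explored numerically. The sketch
below is an illustration under explicit assumptions and not a restatement of
the proof: it assumes that the three agents holding a clause triplet eat the
three houses in the same order at unit speed, that an agent who is "in advance"
starts 1/9 of a time unit before the manipulator, and that the three agents
then join the prize house together. The function names and the value 1/9 are
assumptions made for this example.

```python
from fractions import Fraction
from typing import Sequence


def triplet_finish_time(advances: Sequence[Fraction], houses: int = 3) -> Fraction:
    """Time after the manipulator's start at which the clause houses are finished.

    advances[k] is how much earlier than the manipulator agent k starts eating.
    All agents eat at unit speed, so sum_k (t + advances[k]) = houses.
    """
    return (houses - sum(advances)) / len(advances)


def prize_share_upper_bound(alone_until: Fraction, newcomers: int = 3) -> Fraction:
    """Manipulator's share of the prize house if `newcomers` agents join at `alone_until`."""
    return alone_until + (1 - alone_until) / (newcomers + 1)


ADVANCE = Fraction(1, 9)          # assumed lead of an agent that is in advance

# Satisfied clause, worst case: two of the three agents are in advance.
satisfied = triplet_finish_time([ADVANCE, ADVANCE, Fraction(0)])
# Unsatisfied clause: all three agents are in advance.
unsatisfied = triplet_finish_time([ADVANCE, ADVANCE, ADVANCE])
```

Under these assumptions the manipulator eats alone for 25/27 of a time unit when
every clause is satisfied, but only for 8/9 when some clause is unsatisfied, in
which case his share of the prize house is at most 11/12, which is below 25/27.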

□ 

The reduction can be used to show that even checking whether there exists 
any report that yields more utility than the truthful report is NP-hard. 

Example 4. We illustrate the reduction in the proof of Theorem 3. For the
following SAT formula, the table below illustrates the preference profile for the
agents in the main part. Houses not shown in the preferences are never eaten
by the agents and come later in the preference lists.


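$(x_1 \vee x_2 \vee x_3) \wedge (\neg x_1 \vee \neg x_2 \vee \neg x_3) \wedge (x_1 \vee \neg x_2 \vee x_3) \wedge (\neg x_1 \vee x_2 \vee \neg x_3)$

The four clauses are labelled $c_1$, $c_2$, $c_3$, $c_4$ from left to right.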



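Agent        choice round 1                choice round 2                choice round 3        clause round
1            h^1_{x_1}, h^1_{¬x_1}, h^1_s  h^2_{x_2}, h^2_{¬x_2}, h^2_s  h^3_{x_3}, h^3_{¬x_3}  h_prize
a^1_{x_1}    h^1_{x_1}                     h^2_{x_1}                     h^3_{x_1}             h^1_{c_2}, h^2_{c_2}, h^3_{c_2}, h_prize
a^2_{x_1}    h^1_{x_1}                     h^2_{x_1}                     h^3_{x_1}             h^1_{c_4}, h^2_{c_4}, h^3_{c_4}, h_prize
a^1_{¬x_1}   h^1_{¬x_1}                    h^2_{¬x_1}                    h^3_{¬x_1}            h^1_{c_1}, h^2_{c_1}, h^3_{c_1}, h_prize
a^2_{¬x_1}   h^1_{¬x_1}                    h^2_{¬x_1}                    h^3_{¬x_1}            h^1_{c_3}, h^2_{c_3}, h^3_{c_3}, h_prize
a^1_{x_2}    h^1_{x_2}                     h^2_{x_2}                     h^3_{x_2}             h^1_{c_2}, h^2_{c_2}, h^3_{c_2}, h_prize
a^2_{x_2}    h^1_{x_2}                     h^2_{x_2}                     h^3_{x_2}             h^1_{c_3}, h^2_{c_3}, h^3_{c_3}, h_prize
a^1_{¬x_2}   h^1_{¬x_2}                    h^2_{¬x_2}                    h^3_{¬x_2}            h^1_{c_1}, h^2_{c_1}, h^3_{c_1}, h_prize
a^2_{¬x_2}   h^1_{¬x_2}                    h^2_{¬x_2}                    h^3_{¬x_2}            h^1_{c_4}, h^2_{c_4}, h^3_{c_4}, h_prize
a^1_{x_3}    h^1_{x_3}                     h^2_{x_3}                     h^3_{x_3}             h^1_{c_2}, h^2_{c_2}, h^3_{c_2}, h_prize
a^2_{x_3}    h^1_{x_3}                     h^2_{x_3}                     h^3_{x_3}             h^1_{c_4}, h^2_{c_4}, h^3_{c_4}, h_prize
a^1_{¬x_3}   h^1_{¬x_3}                    h^2_{¬x_3}                    h^3_{¬x_3}            h^1_{c_1}, h^2_{c_1}, h^3_{c_1}, h_prize
a^2_{¬x_3}   h^1_{¬x_3}                    h^2_{¬x_3}                    h^3_{¬x_3}            h^1_{c_3}, h^2_{c_3}, h^3_{c_3}, h_prize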
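
The formula of this example can be checked against the restriction used in the
reduction. The sketch below encodes the four clauses as read from the example
(the encoding and the names `FORMULA`, `literal_counts` and
`satisfying_assignments` are assumptions made for illustration), counts how
often each literal occurs, and enumerates the satisfying assignments by brute
force.

```python
from collections import Counter
from itertools import product
from typing import List, Tuple

Literal = Tuple[int, bool]        # (variable index, True for positive literal)

# The clauses c_1, ..., c_4 of the example, as read from the formula above.
FORMULA: List[List[Literal]] = [
    [(1, True), (2, True), (3, True)],
    [(1, False), (2, False), (3, False)],
    [(1, True), (2, False), (3, True)],
    [(1, False), (2, True), (3, False)],
]


def literal_counts(formula: List[List[Literal]]) -> Counter:
    """Number of occurrences of every literal in the formula."""
    return Counter(lit for clause in formula for lit in clause)


def satisfying_assignments(formula: List[List[Literal]], n: int) -> List[Tuple[bool, ...]]:
    """All truth assignments (x_1, ..., x_n) that satisfy every clause."""
    result = []
    for values in product([False, True], repeat=n):
        if all(any(values[i - 1] == pos for i, pos in clause) for clause in formula):
            result.append(values)
    return result
```

Every one of the six literals occurs exactly twice, as the restricted version
of 3SAT requires, and the formula is satisfiable, for instance by setting
$x_1$ and $x_2$ to true and $x_3$ to false.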


6. Conclusions

We conducted a detailed computational analysis of strategic aspects of the
PS rule. Since PS performs better than RSD in terms of efficiency and
envy-freeness, the only drawback it has in comparison with RSD is its
manipulability. We have shown that although PS is manipulable, finding an
optimal manipulation is a complex task for an agent even if he has complete
knowledge about the preferences of other agents. There is scope to conduct
detailed experiments on the percentage of instances that are manipulable and
the extent and effects of manipulation. Initial experiments show that
manipulation is often possible and more often decreases social welfare than
increases it, though the overall effect is small. As the number of houses
relative to the number of agents grows, the opportunities to manipulate
increase, peaking at around 99%. It will be interesting to extend our results
to the extension of PS for indifferences [14]. Finally, studying coalitional
manipulations and a deeper analysis of Nash dynamics are other interesting
directions.